Dimensionality reduction in patch-signature based protein structure matching
نویسندگان
چکیده
Searching bio-chemical structures is becoming an important application domain of information retrieval. This paper introduces a protein structure matching problem and formulates it as an information retrieval problem. We first present a novel vector representation for protein structures, in which a protein structural region, formed by the vectors within the region, is defined as a patch and indexed by its patch signature. For a k-sized patch, its patch signature consists of 7k − 10 inter-atom distances which uniquely determine the patch’s spatial structure. A patch matching function is then defined. As structures for proteins are large and complex, it is computationally expensive to identify possible matching patches for a given protein against a large protein database. We propose to apply dimensionality reduction to the patch signatures and show how the two problems are adapted to fit each other. The Locality Preservation Projection (LPP) and Singular Value Decomposition (SVD) are chosen and tested for this purpose. Experimental results show that the dimensionality reduction improves the searching speed while maintaining acceptable precision and recall. From a more general point of view, this paper demonstrates that information retrieval techniques can play a crucial role in solving this biologically critical but computationally expensive problem.
منابع مشابه
A Novel Patch-Based Digital Signature
In this paper a new patch-based digital signature (DS) is proposed. The proposed approach similar to steganography methods hides the secure message in a host image. However, it uses a patch-based key to encode/decode the data like cryptography approaches. Both the host image and key patches are randomly initialized. The proposed approach consists of encoding and decoding algorithms. The encodin...
متن کامل2D Dimensionality Reduction Methods without Loss
In this paper, several two-dimensional extensions of principal component analysis (PCA) and linear discriminant analysis (LDA) techniques has been applied in a lossless dimensionality reduction framework, for face recognition application. In this framework, the benefits of dimensionality reduction were used to improve the performance of its predictive model, which was a support vector machine (...
متن کاملVideo-Based Face Recognition Using Earth Mover's Distance
In this paper, we present a novel approach of using Earth Mover’s Distance for video-based face recognition. General methods can be classified into sequential approach and batch approach. Batch approach is to compute a similarity function between two videos. There are two classical batch methods. The one is to compute the angle between subspaces, and the other is to find K-L divergence between ...
متن کاملA Face Recognition Signature Combining Patch-based Features with Soft Facial Attributes
This paper focuses on improving face recognition performance with a new signature combining implicit facial features with explicit soft facial attributes. This signature has two components: the existing patch-based features and the soft facial attributes. A deep convolutional neural network adapted from state-of-the-art networks is used to learn the soft facial attributes. Then, a signature mat...
متن کاملImprovement and parallelization of Snort network intrusion detection mechanism using graphics processing unit
Nowadays, Network Intrusion Detection Systems (NIDS) are widely used to provide full security on computer networks. IDS are categorized into two primary types, including signature-based systems and anomaly-based systems. The former is more commonly used than the latter due to its lower error rate. The core of a signature-based IDS is the pattern matching. This process is inherently a computatio...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2006